LAPACK Working Note 56: Conjugate Gradient Algorithms with Reduced Synchronization Overhead on Distributed Memory Multiprocessors
Authors
Abstract
The standard formulation of the conjugate gradient algorithm involves two inner product computations. The results of these two inner products are needed to update the search direction and the computed solution. Since these inner products are mutually interdependent, in a distributed memory parallel environment their computation and subsequent distribution require two separate communication and synchronization phases. In this paper, we present three related, mathematically equivalent rearrangements of the standard algorithm that reduce the number of communication phases. We present empirical evidence that two of these rearrangements are numerically stable. This claim is further substantiated by a proof that one of the empirically stable rearrangements arises naturally in the symmetric Lanczos method for linear systems, which is equivalent to the conjugate gradient method.
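To illustrate the kind of rearrangement the abstract describes, the sketch below fuses the two inner products of a conjugate gradient iteration in the style of the Chronopoulos-Gear recurrence. This is an assumption for illustration only, not necessarily one of the paper's three rearrangements: here both inner products depend only on the current residual r and w = Ar, so on a distributed-memory machine they could be combined into a single all-reduce per iteration instead of two separate synchronization phases.

```python
import numpy as np

def cg_fused(A, b, tol=1e-10, max_iter=200):
    """Unpreconditioned CG with fused inner products (hypothetical sketch,
    Chronopoulos-Gear-style recurrence). Mathematically equivalent to
    standard CG in exact arithmetic."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = np.zeros_like(b)
    s = np.zeros_like(b)          # maintained so that s = A @ p
    gamma_old = alpha = None
    for _ in range(max_iter):
        w = A @ r
        gamma = r @ r             # both inner products use only r and w,
        delta = r @ w             # so one combined reduction would suffice
        if gamma <= tol**2 * (b @ b):
            break
        if gamma_old is None:     # first iteration: beta = 0
            beta = 0.0
            alpha_new = gamma / delta
        else:
            beta = gamma / gamma_old
            # recurrence replacing alpha = gamma / (p . A p)
            alpha_new = gamma / (delta - beta * gamma / alpha)
        p = r + beta * p
        s = w + beta * s          # s = A p without an extra matrix-vector product
        x = x + alpha_new * p
        r = r - alpha_new * s
        gamma_old, alpha = gamma, alpha_new
    return x
```

The recurrence for alpha follows from the CG orthogonality relations; the price of fusing the reductions is one extra vector update per iteration and, as the paper's stability analysis addresses, potentially different rounding behavior.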
Similar Papers
LAPACK Working Note ?: LAPACK Block Factorization Algorithms on the Intel iPSC/860
The aim of this project is to implement the basic factorization routines for solving linear systems of equations and least squares problems from LAPACK—namely, the blocked versions of LU with partial pivoting, QR, and Cholesky—on a distributed-memory machine. We discuss our implementation of each of the algorithms and the results we obtained using varying matrix orders and block sizes.
LAPACK Working Note 51: Qualitative Properties of the Conjugate Gradient and Lanczos Methods in a Matrix Framework
This paper presents the conjugate gradient and Lanczos methods in a matrix framework, focusing mostly on orthogonality properties of the various vector sequences generated. Various aspects of the methods, such as choice of inner product, preconditioning, and relations to other iterative methods will be considered. Minimization properties of the methods and the fact that they can compute success...
Performance of Parallel Branch and Bound Algorithms on the KSR1 Multiprocessor
In this paper we consider the parallelization of the branch and bound (BB) algorithm with best-first search strategy on the KSR1 shared-memory multiprocessor. Two shared-memory parallel BB algorithms are implemented on a 56-processor system. Measurements indicate that the scalability of the two algorithms is limited by the cost of interprocessor communication and by the cost of synchronization...
LAPACK Working Note 58: The Design of Linear Algebra Libraries for High Performance Computers
This paper discusses the design of linear algebra libraries for high performance computers. Particular emphasis is placed on the development of scalable algorithms for MIMD distributed memory concurrent computers. A brief description of the EISPACK, LINPACK, and LAPACK libraries is given, followed by an outline of ScaLAPACK, which is a distributed memory version of LAPACK currently under develo...
LAPACK: Linear Algebra Software for Supercomputers
This paper presents an overview of the LAPACK library, a portable, public-domain library to solve the most common linear algebra problems. This library provides a uniformly designed set of subroutines for solving systems of simultaneous linear equations, least-squares problems, and eigenvalue problems for dense and banded matrices. We elaborate on the design methodologies incorporated to make ...
Publication date: 1999